The IdMappingRetrieval package in Bioconductor: Collecting and caching identifier mappings from online sources

Author

  • Alex Lisovich
Abstract

Research that integrates data from multiple platforms must merge on the samples processed in parallel on those platforms. However, exploiting the full biological significance of the data depends on merging on the respective features as well. The features have platform-specific biological identifiers, so identifier mapping is critical to this merging. The IdMappingRetrieval package allows initial acquisition of biological identifier mappings (ID maps) from online bioinformatics services, with caching in local data repositories for subsequent fast retrieval. An ID map is a one-to-many map from one ID type (called the primary key) to another (called the secondary key). The services currently supported are NetAffx and Ensembl. The package employs a unified interface for accessing these services, so that the local repositories are easy to create, update, and use. Aside from the identifier maps themselves, the services' complete annotation data sets are also accessible through the same mechanism. Therefore, although ID mapping is the primary goal, the package can also serve as a generic annotation data collection tool if desired. The objects produced by this package are specifically suited for use by the package IdMappingAnalysis, currently in preparation for Bioconductor. The purpose of IdMappingAnalysis is to characterize and compare two or more ID maps. Retrieval is typically subsetted by Affymetrix array.
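
As a rough illustration of the two ideas in the abstract (a one-to-many ID map and a fetch-once, read-from-cache repository), the following Python sketch may help. It is not the IdMappingRetrieval API, which is an R package: the function `get_id_map`, the JSON cache layout, the stand-in `fake_fetch` service, and the identifiers are all hypothetical placeholders chosen for this example.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

# An ID map: each primary key maps to zero or more secondary keys.
IdMap = Dict[str, List[str]]


def get_id_map(cache_file: Path, fetch: Callable[[], IdMap]) -> IdMap:
    """Return the ID map from the local cache, calling fetch() only on a miss."""
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    id_map = fetch()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(id_map))
    return id_map


calls = []


def fake_fetch() -> IdMap:
    # Hypothetical stand-in for a query to an online service.
    # The identifiers are made-up placeholders, not real probe or protein IDs.
    calls.append(1)
    return {"probe_A": ["prot_1", "prot_2"], "probe_B": []}


cache_file = Path(tempfile.mkdtemp()) / "idmaps" / "example_map.json"
first = get_id_map(cache_file, fake_fetch)   # cache miss: fetches and stores
second = get_id_map(cache_file, fake_fetch)  # cache hit: read from disk
```

In this sketch the second call never reaches the service, which is the behavior the abstract describes as "subsequent fast retrieval"; how the actual package stores and keys its repositories is not stated in the abstract.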

Similar resources

BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis

biomaRt is a new Bioconductor package that integrates BioMart data resources with data analysis software in Bioconductor. It can annotate a wide range of gene or gene product identifiers (e.g. Entrez-Gene and Affymetrix probe identifiers) with information such as gene symbol, chromosomal coordinates, Gene Ontology and OMIM annotation. Furthermore biomaRt enables retrieval of genomic sequences a...

The IdMappingAnalysis package in Bioconductor: Critically comparing identifier maps retrieved from bioinformatics annotation resources

With increasing frequency, studies of biological samples include processing on two (or more) high-throughput platforms. Each platform produces a large set of features, each labeled by an identifier. It is one thing to merge the data by sample, simply combining the features on both platforms into a single data set. However, exploiting the full biological significance of the data depends on linki...

lumi: a pipeline for processing Illumina microarray

Illumina microarray is becoming a popular microarray platform. The BeadArray technology from Illumina makes its preprocessing and quality control different from other microarray technologies. Unfortunately, most other analyses have not taken advantage of the unique properties of the BeadArray system, and have just incorporated preprocessing methods originally designed for Affymetrix ...

ontoCAT: an R package for ontology traversal and search

MOTIVATION: There exist few simple and easily accessible methods to integrate ontologies programmatically in the R environment. We present ontoCAT, an R package to access ontologies in widely used standard formats, stored locally in the filesystem or available online. The ontoCAT package supports a number of traversal and search functions on a single ontology, as well as searching for ontology te...

HDTD: analyzing multi-tissue gene expression data

MOTIVATION: By collecting multiple samples per subject, researchers can characterize intra-subject variation using physiologically relevant measurements such as gene expression profiling. This can yield important insights into fundamental biological questions ranging from cell type identity to tumour development. For each subject, the data measurements can be written as a matrix with the differe...

Publication date: 2011